Active selection with label propagation for minimizing human effort in speaker annotation of TV shows
Authors
Abstract
This paper presents an approach that minimizes human involvement in the manual annotation of speakers. At each iteration, a selection strategy chooses the most suitable speech track for manual annotation; the resulting label is then associated with all the tracks in the cluster that contains it. The study makes use of a system that propagates the speaker track labels by means of agglomerative clustering with constraints. Several unsupervised active learning selection strategies are evaluated. Additionally, the presented approach can be used to efficiently generate sets of speech tracks for training biometric models. In this case, both the length of the speech track for a given person and its purity are taken into consideration. The system was evaluated on the REPERE video corpus. Along with the speech tracks extracted from the videos, the optical character recognition system was adapted to extract names of potential speakers. These names were then used as the ‘cold start’ for the selection method.
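The loop described in the abstract (select a track, ask the annotator, propagate the label through constrained clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: tracks are represented by a single scalar feature, clusters are merged by average linkage up to a hypothetical distance `threshold`, the only constraint is that two clusters carrying different manual labels are never merged, and the selection strategy (longest track of the unlabeled cluster with the most speech) is just one plausible choice. The OCR-based cold start is not modelled.

```python
from itertools import combinations


def constrained_clustering(points, labels, threshold):
    """Average-linkage agglomerative clustering that never merges two
    clusters holding different manual labels (a cannot-link constraint)."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        total = sum(abs(points[i] - points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    def compatible(a, b):
        known = {labels[i] for i in a + b if labels[i] is not None}
        return len(known) <= 1

    while True:
        candidates = [
            (dist(a, b), ia, ib)
            for (ia, a), (ib, b) in combinations(enumerate(clusters), 2)
            if compatible(a, b)
        ]
        if not candidates:
            break
        d, ia, ib = min(candidates)
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]  # ia < ib, so deleting ib is safe
        del clusters[ib]
    return clusters


def annotate(points, durations, oracle, threshold, budget):
    """Ask the oracle for at most `budget` labels, then propagate them."""
    manual = [None] * len(points)  # labels given by the human annotator
    for _ in range(budget):
        clusters = constrained_clustering(points, manual, threshold)
        unlabeled = [c for c in clusters if all(manual[i] is None for i in c)]
        if not unlabeled:
            break
        # Assumed strategy: cluster with the most speech, then its longest track.
        target = max(unlabeled, key=lambda c: sum(durations[i] for i in c))
        track = max(target, key=lambda i: durations[i])
        manual[track] = oracle(track)

    # Propagate each manual label to every track in the same cluster.
    propagated = list(manual)
    for cluster in constrained_clustering(points, manual, threshold):
        known = {manual[i] for i in cluster if manual[i] is not None}
        if known:
            name = known.pop()
            for i in cluster:
                propagated[i] = name
    return propagated, sum(label is not None for label in manual)


# Toy data: six speech tracks, three speakers (all values are made up).
points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
durations = [3, 8, 2, 4, 6, 1]
truth = ["A", "A", "A", "B", "B", "C"]

propagated, queries = annotate(points, durations, lambda i: truth[i],
                               threshold=1.0, budget=10)
print(propagated, queries)
```

On this toy input every track ends up labeled after three manual annotations instead of six, which is the kind of saving the approach aims for; with real speech tracks the features, distance, and stopping criterion would of course differ.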
Similar resources
An Active Learning Method for Speaker Identity Annotation in Audio Recordings
Because manual annotation of speech is expensive and slow, this paper attempts to assist an annotator in performing speaker diarization. This assistance takes place in the context of annotating a large volume of archives. We propose a method that reduces the number of human interventions. The method corrects a diarization by taking the human interventions into account....
Partition sampling: an active learning selection strategy for large database annotation
Annotating a video database requires intensive, time-consuming and error-prone human effort. However, this is a mandatory task to efficiently analyze multimedia contents. We propose a new selection strategy for active learning methods to minimize human effort in labeling a large database of video sequences. Formally, active learning is a process where new unlabeled samples are iteratively s...
Active Frame Selection for Label Propagation in Videos
Manually segmenting and labeling objects in video sequences is quite tedious, yet such annotations are valuable for learning-based approaches to object and activity recognition. While automatic label propagation can help, existing methods simply propagate annotations from arbitrarily selected frames (e.g., the first one) and so may fail to best leverage the human effort invested. We define an a...
Combining Active Learning and Partial Annotation for Domain Adaptation of a Japanese Dependency Parser
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows that active learning can be used for domain adaptation of ...
Combining Active Learning and Partial Annotation for Japanese Dependency Parsing
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...